Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC

نویسندگان

  • Yimeng Zhuang
  • Xuankai Chang
  • Yanmin Qian
  • Kai Yu
چکیده

Keyword spotting (KWS) aims to detect predefined keywords in continuous speech. Recently, direct deep learning approaches have been used for KWS and achieved great success. However, these approaches mostly assume fixed keyword vocabulary and require significant retraining efforts if new keywords are to be detected. For unrestricted vocabulary, HMM based keywordfiller framework is still the mainstream technique. In this paper, a novel deep learning approach is proposed for unrestricted vocabulary KWS based on Connectionist Temporal Classification (CTC) with Long Short-Term Memory (LSTM). Here, an LSTM is trained to discriminant phones with the CTC criterion. During KWS, an arbitrary keyword can be specified and it is represented by one or more phone sequences. Due to the property of peaky phone posteriors of CTC, the LSTM can produce a phone lattice. Then, a fast substring matching algorithm based on minimum edit distance is used to search the keyword phone sequence on the phone lattice. The approach is highly efficient and vocabulary independent. Experiments showed that the proposed approach can achieve significantly better results compared to a DNN-HMM based keyword-filler decoding system. In addition, the proposed approach is also more efficient than the DNN-HMM KWS baseline.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB) as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIBchr('39')s archive is one of the richest archives in Iran containing a huge amount of multimedia data. Monitoring this massive volume of data, and brows and retrieval of this archive is one of the key issues for this broadcasting...

متن کامل

Small-footprint Keyword Spotting Using Deep Neural Network and Connectionist Temporal Classifier

Mainly for the sake of solving the lack of keyword-specific data, we propose one Keyword Spotting (KWS) system using Deep Neural Network (DNN) and Connectionist Temporal Classifier (CTC) on power-constrained small-footprint mobile devices, taking full advantage of general corpus from continuous speech recognition which is of great amount. DNN is to directly predict the posterior of phoneme unit...

متن کامل

Keyword spotting in multi-player voice driven games for children

Word spotting, or keyword identification, is a highly challenging task when there are multiple speakers speaking simultaneously. In the case of a game being controlled by children solely through voice, the task becomes extremely difficult. Children, unlike adults, typically do not await their turn to speak in an orderly fashion. They interrupt and shout at arbitrary times, speak or say things t...

متن کامل

Comparison of keyword spotting methods for searching in speech

This paper presents and discusses keyword spotting methods for searching in speech. In contrast with searching in text, the searching in speech or generally in multimedia data still represents a challenge. The aim of the paper is to present a keyword spotting (KWS) method based on a large vocabulary continuous speech recognition (LVCSR) system, based on phonetics decoder, and keyword spotting u...

متن کامل

Improvement of rejection performance of keyword spotting using anti-keywords derived from large vocabulary considering acoustical similarity to keywords

This paper proposes an efficient anti-keyword derivation method to improve the rejection performance of keyword spotting. In this method, each anti-keyword is derived from the large vocabulary lexicon considering acoustic similarity to keywords, making use of the confusion matrix. Experimental results show that a 3 % improvement of the rejection rate is obtained compared to conventional methods...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016